Speaker Identification Using Multiband Linear Predictive Code
Abstract
This paper presents an effective method for improving the performance of a speaker identification system. Based on the multiresolution property of the wavelet transform, the input speech signal is decomposed into L subbands. To capture the characteristics of the vocal tract, the linear predictive code (LPC) of each band, as well as the LPC of the full band, is calculated. The feature recombination scheme combines the LPC of each band and the LPC of the full band into a single feature vector, and the Euclidean distance is then used to measure the similarity between the test and reference speech. Experimental results show that the proposed method achieves better performance than speaker identification using LPC and real cepstral coefficients.

IJCCCE, Vol. 7, No. 2, 2007

1 Introduction

Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. This technique makes it possible to use the speaker's voice to verify his or her identity and to control access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, security control for confidential information areas, and remote access to computers [1].
Speaker recognition can be classified into identification and verification. Speaker verification is the process of determining whether or not a speech sample belongs to a specific claimed speaker. Speaker identification, on the other hand, is the process of determining which registered speaker produced a given utterance (word or phrase). Speaker recognition methods can also be divided into text-independent and text-dependent methods. In a text-independent system, speaker models capture characteristics of a person's speech that show up irrespective of what is being said, while in a text-dependent system the recognition of the speaker's identity is based on his or her speaking one or more specific phrases, such as passwords or card numbers [2].

Much research has been done on feature extraction from speech. The linear predictive code (LPC) has been used because of its simplicity and effectiveness in speaker recognition [3]. Cepstral coefficients are another widely used set of feature parameters. Cepstral coefficients and their time derivatives are used as features in order to capture dynamic information and to eliminate time-invariant spectral information that is generally attributed to the interposed communication channel [4].

In this paper, the multiband linear predictive code (MBLPC) is used in a speaker identification system. The method is based on the multiresolution property of the wavelet transform. The input speech signal is decomposed into L subbands, and the linear predictive code of each band, as well as the LPC of the full band, is calculated. Feature recombination and a distance measure are then used to perform speaker identification.

This paper is organized as follows. Feature extraction is described in Section 2. The distance measure is described in Section 3. Section 4 presents the multiband speaker identification model. Experimental results are presented in Section 5. Concluding remarks are made in Section 6.
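The multiband pipeline outlined in this introduction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Haar wavelet, the number of decomposition levels, the LPC order of 8, the treatment of each utterance as a single analysis frame, and all function names (`haar_subbands`, `lpc`, `mblpc_features`, `identify`) are hypothetical choices that the text does not specify. The synthetic "speakers" are simply white noise shaped by two different filters.

```python
import numpy as np

def haar_subbands(signal, levels):
    """Split a signal into levels + 1 subbands (Haar wavelet is an assumption)."""
    bands = []
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        if len(approx) % 2:                       # pad to an even length
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        bands.append((even - odd) / np.sqrt(2.0))  # detail (high-pass) band
        approx = (even + odd) / np.sqrt(2.0)       # approximation (low-pass) band
    bands.append(approx)
    return bands

def lpc(signal, order):
    """Predictor coefficients via the autocorrelation method (Levinson-Durbin)."""
    x = np.asarray(signal, dtype=float)
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)                        # error filter 1 + a1 z^-1 + ...
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:                             # degenerate (e.g. silent) input
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                             # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return -a[1:]                                  # s(n) ~ sum_k coef[k] * s(n-k)

def mblpc_features(signal, levels=2, order=8):
    """Concatenate full-band LPC with the LPC of every subband."""
    parts = [lpc(signal, order)]
    parts += [lpc(band, order) for band in haar_subbands(signal, levels)]
    return np.concatenate(parts)

def identify(test_signal, references, levels=2, order=8):
    """Return the reference name with the smallest Euclidean distance."""
    test_vec = mblpc_features(test_signal, levels, order)
    distances = {
        name: np.linalg.norm(test_vec - mblpc_features(ref, levels, order))
        for name, ref in references.items()
    }
    return min(distances, key=distances.get)

# Synthetic stand-ins for two speakers: noise through two different filters.
rng = np.random.default_rng(0)
references = {
    "A": np.convolve(rng.standard_normal(2048), [1.0, 0.9]),   # low-pass tilt
    "B": np.convolve(rng.standard_normal(2048), [1.0, -0.9]),  # high-pass tilt
}
test_utterance = np.convolve(rng.standard_normal(2048), [1.0, 0.9])
speaker = identify(test_utterance, references)
print("closest reference:", speaker)
```

A real system would typically window the speech into short frames and use a wavelet library for the decomposition; the whole-utterance, hand-rolled Haar version here only keeps the sketch dependency-free.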
2 Feature Extraction

2-1 Linear Predictive Coding (LPC) [5]

One of the most powerful speech analysis techniques is the method of linear predictive analysis. This method has become the predominant technique for estimating the basic speech parameters (e.g., pitch, formants, spectra, and vocal tract area functions) and for representing speech for low-bit-rate transmission or storage. The importance of this method lies both in its ability to provide extremely accurate estimates of the speech parameters and in its relative speed of computation.

The basic idea behind LPC analysis is that a speech sample can be approximated as a linear combination of past speech samples. By minimizing the sum of the squared differences (over a finite interval) between the actual speech samples and the linearly predicted ones, a unique set of predictor coefficients can be determined. It is assumed that the variation of the vocal tract shape with time can be approximated with sufficient accuracy by a succession of stationary shapes. Under this assumption, the all-pole transfer function H(z) that produces the output speech s(n) from the input excitation u(n) (either an impulse or random noise) is given by:
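The conventional textbook form of this all-pole model is given here as an assumption. The predictor order $p$, the predictor coefficients $a_k$, and the gain $G$ are standard notation that does not appear in the text above, and the paper's own sign convention may differ:

$$
H(z) = \frac{S(z)}{U(z)} = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}}
$$

In the same notation, the linear prediction of a sample and the squared-error criterion described in the preceding paragraph can be written as:

$$
\hat{s}(n) = \sum_{k=1}^{p} a_k\, s(n-k), \qquad
e(n) = s(n) - \hat{s}(n), \qquad
E = \sum_{n} e^{2}(n)
$$

Minimizing $E$ over the analysis interval with respect to each $a_k$ yields the predictor coefficients, which are the LPC features used in the rest of the method.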